Least Ambiguous Set-Valued Classifiers with Bounded Error Levels

Authors

  • Mauricio Sadinle
  • Jing Lei
  • Larry A. Wasserman
Abstract

In most classification tasks there are observations that are ambiguous and therefore difficult to correctly label. Set-valued classification allows the classifiers to output a set of plausible labels rather than a single label, thereby giving a more appropriate and informative treatment to the labeling of ambiguous instances. We introduce a framework for multiclass set-valued classification, where the classifiers guarantee user-defined levels of coverage or confidence (the probability that the true label is contained in the set) while minimizing the ambiguity (the expected size of the output). We first derive oracle classifiers assuming the true distribution to be known. We show that the oracle classifiers are obtained from level sets of the functions that define the conditional probability of each class. Then we develop estimators with good asymptotic and finite sample properties. The proposed classifiers build on and refine many existing single-label classifiers. The optimal classifier can sometimes output the empty set. We provide two solutions to fix this issue that are suitable for various practical needs.
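As a rough illustration of the level-set idea described in the abstract, the sketch below builds label sets by thresholding estimated class probabilities. It assumes the probabilities come from some already fitted classifier and that a held-out labeled sample is available; the function names `class_thresholds` and `predict_sets`, and the simple empirical thresholding rule, are illustrative assumptions rather than the authors' published procedure.

```python
import numpy as np


def class_thresholds(probs, labels, alpha):
    """Pick one threshold per class from a held-out sample.

    probs  : (n, K) array of estimated class probabilities (assumed given).
    labels : (n,) array of true labels in {0, ..., K-1}.
    alpha  : allowed miscoverage per class, 0 <= alpha < 1.

    For each class y, the threshold is chosen so that at least a
    (1 - alpha) fraction of held-out points with true label y have
    probs[i, y] >= threshold.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n_classes = probs.shape[1]
    thresholds = np.zeros(n_classes)
    for y in range(n_classes):
        scores = np.sort(probs[labels == y, y])  # ascending order
        if scores.size == 0:
            continue  # no held-out data for this class: always include it
        # At most floor(alpha * n_y) points may fall below the threshold.
        k = int(np.floor(alpha * scores.size))
        thresholds[y] = scores[k]
    return thresholds


def predict_sets(probs, thresholds):
    """Return, for each row, the labels whose probability clears its threshold."""
    probs = np.asarray(probs, dtype=float)
    return [np.flatnonzero(row >= thresholds).tolist() for row in probs]


# Hypothetical held-out sample with two classes.
cal_probs = [[0.9, 0.1], [0.7, 0.3], [0.4, 0.6],
             [0.2, 0.8], [0.35, 0.65], [0.55, 0.45]]
cal_labels = [0, 0, 0, 1, 1, 1]
new_probs = [[0.95, 0.05], [0.5, 0.5], [0.3, 0.7]]

loose = predict_sets(new_probs, class_thresholds(cal_probs, cal_labels, 0.0))
strict = predict_sets(new_probs, class_thresholds(cal_probs, cal_labels, 0.34))
```

With the looser setting the ambiguous middle point receives both labels, while the stricter setting leaves it with no label at all, which is the empty-set behaviour the abstract mentions.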

Similar resources

Nonlinear Viscosity Algorithm with Perturbation for Nonexpansive Multi-Valued Mappings

In this paper, based on the viscosity technique with perturbation, we introduce a new nonlinear viscosity algorithm for finding an element of the set of fixed points of nonexpansive multi-valued mappings in a Hilbert space. We derive a strong convergence theorem for this new algorithm under appropriate assumptions. Moreover, in support of our results, some numerical examples (u...

Similarity, Cardinality and Entropy for Bipolar Fuzzy Set in the Framework of Penta-valued Representation

This paper presents new similarity, cardinality and entropy measures for bipolar fuzzy sets and for their particular forms such as intuitionistic, paraconsistent and fuzzy sets. All of these are constructed in the framework of multi-valued representations and are based on a penta-valued logic that uses the following logical values: true, false, unknown, contradictory and ambiguous. Also a new dist...

Boosting SVM Classifiers with Logistic Regression

The support vector machine classifier is a linear maximum margin classifier. It performs very well in many classification applications. Although it can be extended to nonlinear cases by exploiting the idea of kernels, it might still suffer from heterogeneity in the training examples. Since there are very few theories in the literature to guide us on how to choose kernel functions, the sel...

(Not) Bounding the True Error

We present a new approach to bounding the true error rate of a continuous-valued classifier based upon PAC-Bayes bounds. The method first constructs a distribution over classifiers by determining how sensitive each parameter in the model is to noise. The true error rate of the stochastic classifier found with the sensitivity analysis can then be tightly bounded using a PAC-Bayes bound. In this ...

Compact composition operators on real Banach spaces of complex-valued bounded Lipschitz functions

We characterize compact composition operators on real Banach spaces of complex-valued bounded Lipschitz functions on metric spaces, not necessarily compact, with Lipschitz involutions and determine their spectra.

Journal:
  • CoRR

Volume: abs/1609.00451

Publication date: 2016